Disentangled Counterfactual Learning for Physical Audiovisual Commonsense Reasoning: Supplementary Material
Anonymous Author(s)
Moreover, we show more visualization results in experiments.
To ensure a fair comparison, we used the same fusion and optimization method as Latefusion.
When k=1, it means that the object's physical properties are related only to itself, while
As described in Section 3.1 of our paper, we represent audio
Table 2: Performance comparison between our proposed DSE-audio and existing baseline methods.
As shown in Table 2, we compare our method with other baseline methods.
In Figure 6, we show a few additional examples of clustering using dynamic factors.
Disentangled Counterfactual Learning for Physical Audiovisual Commonsense Reasoning
In this paper, we propose a Disentangled Counterfactual Learning (DCL) approach for physical audiovisual commonsense reasoning. The task aims to infer objects' physical commonsense from both video and audio input, where the main challenge is imitating the reasoning ability of humans. Most current methods fail to take full advantage of the different characteristics of multi-modal data, and the lack of causal reasoning ability in models impedes progress in inferring implicit physical knowledge. To address these issues, our proposed DCL method decouples videos into static (time-invariant) and dynamic (time-varying) factors in the latent space via a disentangled sequential encoder, which adopts a variational autoencoder (VAE) and maximizes mutual information with a contrastive loss function. Furthermore, we introduce a counterfactual learning module to augment the model's reasoning ability by modeling physical knowledge relationships among different objects under counterfactual intervention. Our proposed method is a plug-and-play module that can be incorporated into any baseline. In experiments, we show that our proposed method improves baseline methods and achieves state-of-the-art performance. Our source code is available at https://github.com/Andy20178/DCL.
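To make the static/dynamic disentanglement and the mutual-information objective concrete, below is a minimal numerical sketch. This is not the paper's actual sequential VAE: `encode_disentangled` is a hypothetical mean/residual split standing in for the learned encoder, and `info_nce` is a generic InfoNCE-style contrastive loss of the kind commonly used as a mutual-information lower bound.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode_disentangled(video_feats):
    """Toy stand-in for the disentangled sequential encoder: split a frame-feature
    sequence of shape (T, D) into a time-invariant 'static' factor and
    time-varying 'dynamic' factors. (Illustrative only; the paper learns this
    split with a sequential VAE rather than a mean/residual decomposition.)"""
    static = video_feats.mean(axis=0)   # (D,)  time-invariant summary
    dynamic = video_feats - static      # (T, D) per-frame variation around it
    return static, dynamic

def info_nce(anchor, positive, negatives, tau=0.1):
    """Generic InfoNCE contrastive loss: pull the positive pair together and
    push negatives apart; minimizing it maximizes a mutual-information bound."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()              # numerical stability before softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])            # low loss <=> positive scores highest

# Fabricated example features (T frames, D-dim each), just to exercise the sketch.
T, D = 8, 16
video = rng.normal(size=(T, D))
static, dynamic = encode_disentangled(video)

# Contrast the static factor against a lightly perturbed view of itself
# versus random negatives.
loss = info_nce(static, static + 0.01 * rng.normal(size=D),
                [rng.normal(size=D) for _ in range(4)])
print(static.shape, dynamic.shape, float(loss))
```

In the actual model the two factors are latent variables with their own priors, but the shapes and the contrastive objective follow the same pattern as above.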